Detection of Malicious and Low Throughput Data Exfiltration Over the DNS Protocol
In the presence of security countermeasures, a malware designed for data
exfiltration must do so using a covert channel to achieve its goal. Among
existing covert channels stands the domain name system (DNS) protocol. Although
the detection of covert channels over the DNS has been thoroughly studied in
the last decade, previous research dealt with a specific subclass of covert
channels, namely DNS tunneling. While the importance of tunneling detection is
not undermined, an entire class of low throughput DNS exfiltration malware
remained overlooked. The goal of this study is to propose a method for
detecting both tunneling and low-throughput data exfiltration over the DNS.
Towards this end, we propose a solution composed of a supervised feature
selection method and an interchangeable, adjustable anomaly detection model
trained on legitimate traffic. In the first step, a one-class classifier
is applied for detecting domain-specific traffic that does not conform with the
normal behavior. Then, in the second step, in order to reduce the false
positive rate resulting from the attempt to detect the low-throughput data
exfiltration, we apply a rule-based filter that filters out data exchange over
DNS used by legitimate services. Our solution was evaluated on the logs of a
medium-scale recursive DNS server, and involved more than 75,000 legitimate
uses and almost 2,000 attacks. The evaluation results show that while DNS tunneling is
covered with at least 99% recall rate and less than 0.01% false positive rate,
the detection of low-throughput exfiltration is more difficult. While not
preventing it completely, our solution limits a malware attempting to avoid
detection to at most 1 kb/h of payload under the limitations of the DNS
syntax (equivalent to five credit card details, or ten user credentials, per
hour), which reduces the effectiveness of the attack.
Comment: 5 figures, 7 tables
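The two-step pipeline described above can be sketched as follows. Everything here is illustrative: the feature set, the z-score rule (standing in for the trained one-class classifier), the threshold k, and the allowlisted domain are assumptions, not the paper's actual choices.

```python
import math
from collections import Counter

def entropy(s):
    """Shannon entropy (bits/char); encoded exfiltration payloads tend to score high."""
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())

def query_features(qname):
    """Per-query features (illustrative, not the paper's exact feature set)."""
    labels = qname.rstrip(".").split(".")
    subdomain = ".".join(labels[:-2]) if len(labels) > 2 else ""
    return [len(qname), len(subdomain), entropy(qname)]

class ZScoreOneClass:
    """Stand-in for the one-class classifier: fit per-feature mean/std on
    legitimate traffic, flag queries deviating more than k std devs."""
    def __init__(self, k=3.0):
        self.k, self.mean, self.std = k, [], []

    def fit(self, rows):
        cols = list(zip(*rows))
        self.mean = [sum(c) / len(c) for c in cols]
        self.std = [max((sum((x - m) ** 2 for x in c) / len(c)) ** 0.5, 1e-9)
                    for c, m in zip(cols, self.mean)]
        return self

    def is_anomalous(self, row):
        return any(abs(x - m) / s > self.k
                   for x, m, s in zip(row, self.mean, self.std))

# Step 2: rule-based filter for services known to legitimately exchange
# data over DNS (hypothetical allowlist entry).
ALLOWLIST = {"example-av.com"}

def detect(qname, model):
    base = ".".join(qname.rstrip(".").split(".")[-2:])
    if base in ALLOWLIST:
        return False
    return model.is_anomalous(query_features(qname))

legit = ["www.example.com", "mail.google.com", "cdn.jsdelivr.net",
         "en.wikipedia.org", "api.github.com", "news.ycombinator.com"]
model = ZScoreOneClass().fit([query_features(q) for q in legit])
exfil = "4f2a9c81d7e3b6a0f5c2e8d1a9b7c4e6f3a1d8.t.evil-c2.com"
```

A long, high-entropy encoded subdomain deviates from the legitimate profile and is flagged, while allowlisted services bypass the anomaly model entirely.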
MaskDGA: A Black-box Evasion Technique Against DGA Classifiers and Adversarial Defenses
Domain generation algorithms (DGAs) are commonly used by botnets to generate
domain names through which bots can establish a resilient communication channel
with their command and control servers. Recent publications presented deep
learning, character-level classifiers that are able to detect algorithmically
generated domain (AGD) names with high accuracy, and correspondingly,
significantly reduce the effectiveness of DGAs for botnet communication. In
this paper we present MaskDGA, a practical adversarial learning technique that
adds perturbation to the character-level representation of algorithmically
generated domain names in order to evade DGA classifiers, without the attacker
having any knowledge about the DGA classifier's architecture and parameters.
MaskDGA was evaluated using the DMD-2018 dataset of AGD names and four recently
published DGA classifiers, in which the average F1-score of the classifiers
degrades from 0.977 to 0.495 when applying the evasion technique. An additional
evaluation was conducted using the same classifiers but with adversarial
defenses implemented: adversarial re-training and distillation. The results of
this evaluation show that MaskDGA can be used for improving the robustness of
character-level DGA classifiers against adversarial attacks, but that
ideally DGA classifiers should incorporate additional features alongside the
character-level features, which are demonstrated in this study to be vulnerable
to adversarial attacks.
Comment: 12 pages, 2 figures
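The perturbation step can be caricatured as below. This is a heavily simplified stand-in: MaskDGA chooses which characters to replace using gradients of a locally trained surrogate model, whereas this sketch picks positions at random; the alphabet, replacement fraction, and example domain are assumptions.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"

def mask_chars(domain, frac=0.5, seed=0):
    """Replace a fraction of second-level-domain characters while keeping
    the domain's length and TLD intact (random positions stand in for the
    surrogate-gradient selection used by the actual technique)."""
    rng = random.Random(seed)
    sld, _, tld = domain.partition(".")
    chars = list(sld)
    for i in rng.sample(range(len(chars)), max(1, int(len(chars) * frac))):
        chars[i] = rng.choice([c for c in ALPHABET if c != chars[i]])
    return "".join(chars) + "." + tld

adversarial = mask_chars("qkfzxvbnrt.com")  # made-up AGD name
```

The perturbed name preserves length and TLD, so it remains a syntactically valid domain while its character distribution no longer matches what the classifier learned.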
Detecting Cyberattacks in Industrial Control Systems Using Convolutional Neural Networks
This paper presents a study on detecting cyberattacks on industrial control
systems (ICS) using unsupervised deep neural networks, specifically,
convolutional neural networks. The study was performed on a SecureWater
Treatment testbed (SWaT) dataset, which represents a scaled-down version of a
real-world industrial water treatment plant. e suggest a method for anomaly
detection based on measuring the statistical deviation of the predicted value
from the observed value.We applied the proposed method by using a variety of
deep neural networks architectures including different variants of
convolutional and recurrent networks. The test dataset from SWaT included 36
different cyberattacks. The proposed method successfully detects the vast
majority of the attacks with a low false positive rate thus improving on
previous works based on this data set. The results of the study show that 1D
convolutional networks can be successfully applied to anomaly detection in
industrial control systems and outperform more complex recurrent networks while
being much smaller and faster to train.
Comment: Proceedings of the 2018 Workshop on Cyber-Physical Systems Security and Privacy
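The anomaly rule itself (flag points whose prediction residual deviates statistically from residuals seen on clean data) can be sketched independently of the network. A naive moving-average predictor stands in for the paper's 1D CNN here, and the window size and threshold are assumptions.

```python
def residuals(series, w=5):
    """Prediction residuals; a moving-average predictor stands in for the 1D CNN."""
    return [series[i] - sum(series[i - w:i]) / w for i in range(w, len(series))]

def fit_stats(res):
    """Mean/std of residuals on (assumed attack-free) training data."""
    mu = sum(res) / len(res)
    sd = (sum((r - mu) ** 2 for r in res) / len(res)) ** 0.5
    return mu, max(sd, 1e-9)

def detect(series, mu, sd, w=5, k=4.0):
    """Indices whose prediction residual deviates more than k std devs."""
    return [i for i in range(w, len(series))
            if abs(series[i] - sum(series[i - w:i]) / w - mu) / sd > k]

train = [10 + 0.1 * (i % 3) for i in range(60)]  # clean periodic sensor readings
mu, sd = fit_stats(residuals(train))
test = list(train)
test[30] = 50.0                                  # injected attack spike
alarms = detect(test, mu, sd)
```

The spike produces a residual hundreds of standard deviations from the clean residual distribution, while the clean series raises no alarms.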
Deployment Optimization of IoT Devices through Attack Graph Analysis
The Internet of Things (IoT) has become an integral part of our lives at both
work and home. However, these IoT devices are prone to vulnerability exploits
due to their low cost, limited resources, diversity of vendors, and proprietary
firmware. Moreover, short-range communication protocols (e.g., Bluetooth or
ZigBee) open additional opportunities for the lateral movement of an attacker
within an organization. Thus, the type and location of IoT devices may
significantly change the level of network security of the organizational
network. In this paper, we quantify the level of network security based on an
augmented attack graph analysis that accounts for the physical location of IoT
devices and their communication capabilities. We use the depth-first branch and
bound (DFBnB) heuristic search algorithm to solve two optimization problems:
Full Deployment with Minimal Risk (FDMR) and Maximal Utility without Risk
Deterioration (MURD). An admissible heuristic is proposed to accelerate the
search. The proposed method is evaluated using a real network with simulated
deployment of IoT devices. The results demonstrate (1) the contribution of the
augmented attack graphs to quantifying the impact of IoT devices deployed
within the organization on security, and (2) the effectiveness of the optimized
IoT deployment.
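A depth-first branch and bound skeleton with an admissible (never-overestimating) heuristic can be sketched as follows. The additive per-device risk table is a toy stand-in for the attack-graph-derived risk in the paper; device and location names are made up.

```python
def dfbnb(devices, locations, risk):
    """Assign each device a location minimizing total risk. The heuristic
    (each unassigned device's cheapest location) lower-bounds every
    completion, so pruning never discards the optimal plan."""
    best = {"cost": float("inf"), "plan": None}
    cheapest = {d: min(risk[d, l] for l in locations) for d in devices}

    def search(i, cost, plan):
        if i == len(devices):
            if cost < best["cost"]:
                best["cost"], best["plan"] = cost, dict(plan)
            return
        if cost + sum(cheapest[d] for d in devices[i:]) >= best["cost"]:
            return  # bound: this branch cannot beat the incumbent
        d = devices[i]
        for l in sorted(locations, key=lambda l: risk[d, l]):
            plan[d] = l
            search(i + 1, cost + risk[d, l], plan)
        del plan[d]

    search(0, 0.0, {})
    return best["cost"], best["plan"]

# Toy instance: risk contribution of deploying each device at each location.
risk = {("camera", "lobby"): 5, ("camera", "lab"): 1,
        ("lock", "lobby"): 2, ("lock", "lab"): 4}
cost, plan = dfbnb(["camera", "lock"], ["lobby", "lab"], risk)
```

Sorting children by cost lets the search find a good incumbent early, which tightens the bound and accelerates pruning — the role the proposed admissible heuristic plays in the paper.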
MDGAN: Boosting Anomaly Detection Using Multi-Discriminator Generative Adversarial Networks
Anomaly detection is often considered a challenging field of machine learning
due to the difficulty of obtaining anomalous samples for training and the need
to obtain a sufficient amount of training data. In recent years, autoencoders
have been shown to be effective anomaly detectors that train only on "normal"
data. Generative adversarial networks (GANs) have been used to generate
additional training samples for classifiers, thus making them more accurate and
robust. However, in anomaly detection GANs are only used to reconstruct
existing samples rather than to generate additional ones. This stems both from
the small amount and lack of diversity of anomalous data in most domains. In
this study we propose MDGAN, a novel GAN architecture for improving anomaly
detection through the generation of additional samples. Our approach uses two
discriminators: a dense network for determining whether the generated samples
are of sufficient quality (i.e., valid) and an autoencoder that serves as an
anomaly detector. MDGAN enables us to reconcile two conflicting goals: 1)
generate high-quality samples that can fool the first discriminator, and 2)
generate samples that can eventually be effectively reconstructed by the second
discriminator, thus improving its performance. Empirical evaluation on a
diverse set of datasets demonstrates the merits of our approach.
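The two conflicting goals can be made concrete as a combined generator objective. The log-loss form and the weighting term alpha below are assumptions for illustration, not the paper's stated loss.

```python
import math

def generator_loss(d_prob, recon_error, alpha=0.5):
    """Combined objective for an MDGAN-style generator: the adversarial
    term rewards fooling the dense discriminator (d_prob -> 1), while the
    reconstruction term rewards samples the autoencoder discriminator can
    reconstruct well (recon_error -> 0). alpha trades the two off."""
    adversarial = -math.log(max(d_prob, 1e-12))
    return adversarial + alpha * recon_error
```

A generated sample that both looks valid and reconstructs well scores lower than one that fails either discriminator, so minimizing this loss pushes the generator toward samples useful for training the anomaly detector.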
Content-based data leakage detection using extended fingerprinting
Protecting sensitive information from unauthorized disclosure is a major
concern of every organization. As an organization's employees need to access
such information in order to carry out their daily work, data leakage detection
is both an essential and challenging task. Whether caused by malicious intent
or an inadvertent mistake, data loss can result in significant damage to the
organization. Fingerprinting is a content-based method used for detecting data
leakage. In fingerprinting, signatures of known confidential content are
extracted and matched with outgoing content in order to detect leakage of
sensitive content. Existing fingerprinting methods, however, suffer from two
major limitations. First, fingerprinting can be bypassed by rephrasing (or
minor modification) of the confidential content, and second, usually the whole
content of a document is fingerprinted (including non-confidential parts),
resulting in false alarms. In this paper we propose an extension to the
fingerprinting approach that is based on sorted k-skip-n-grams. The proposed
method is able to produce a fingerprint of the core confidential content which
ignores non-relevant (non-confidential) sections. In addition, the proposed
fingerprinting method is more robust to rephrasing and can also be used to
detect a previously unseen confidential document, therefore providing better
detection of intentional leakage incidents.
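The core idea of sorted k-skip-n-grams (sorting makes each gram order-insensitive; skipping lets it survive word insertions) can be sketched as below. Tokenization, gram selection, and hashing details are omitted, and the window construction is one plausible reading rather than the paper's exact definition.

```python
from itertools import combinations

def sorted_skip_ngrams(text, n=2, k=1):
    """All sorted n-token subsequences drawn from each (n + k)-token
    window, so a locally reordered or padded rephrasing still shares
    grams with the original confidential text."""
    tokens = text.lower().split()
    grams = set()
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n + k]
        for combo in combinations(window, n):
            grams.add(tuple(sorted(combo)))
    return grams

original = sorted_skip_ngrams("the formula contains sodium chloride")
rephrased = sorted_skip_ngrams("the formula contains chloride and sodium")
```

Plain bigrams of the two sentences would miss the reordered pair entirely; the sorted skip-grams still overlap, which is what makes the fingerprint robust to rephrasing.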
Analysis of Location Data Leakage in the Internet Traffic of Android-based Mobile Devices
In recent years we have witnessed a shift towards personalized, context-based
applications and services for mobile device users. A key component of many of
these services is the ability to infer the current location and predict the
future location of users based on location sensors embedded in the devices.
Such knowledge enables service providers to present relevant and timely offers
to their users and better manage traffic congestion, thus increasing
customer satisfaction and engagement. However, such services suffer from
location data leakage which has become one of today's most concerning privacy
issues for smartphone users. In this paper we focus specifically on location
data that is exposed by Android applications via Internet network traffic in
plaintext (i.e., without encryption) without the user's awareness. We present
an empirical evaluation, involving the network traffic of real mobile device
users, aimed at: (1) measuring the extent of location data leakage in the
Internet traffic of Android-based smartphone devices; and (2) understanding the
value of this data by inferring users' points of interests (POIs). This was
achieved by analyzing the Internet traffic recorded from the smartphones of a
group of 71 participants for an average period of 37 days. We also propose a
procedure for mining and filtering location data from raw network traffic and
utilize geolocation clustering methods to infer users' POIs. The key findings
of this research center on the extent of this phenomenon in terms of both
ubiquity and severity; we found that over 85% of the users' devices leak
location data, and the exposure rate of users' POIs, derived from the
relatively sparse leakage indicators, is around 61%.
Comment: 11 pages, 10 figures
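The mining-and-clustering procedure can be sketched in a few lines. The query-parameter names matched here are illustrative (real apps leak coordinates under many vendor-specific keys), and the grid heuristic is a toy stand-in for the geolocation clustering used in the paper.

```python
import re
from collections import Counter

COORD_RE = re.compile(r"lat(?:itude)?=(-?\d{1,2}\.\d+).*?"
                      r"(?:lon(?:gitude)?|lng)=(-?\d{1,3}\.\d+)")

def mine_coords(payloads):
    """Extract plaintext lat/lon pairs from captured HTTP payloads."""
    hits = (COORD_RE.search(p) for p in payloads)
    return [(float(m.group(1)), float(m.group(2))) for m in hits if m]

def grid_pois(points, cell=0.001, min_pts=3):
    """Toy POI inference: bucket location fixes into ~100 m grid cells and
    keep dense cells (a stand-in for proper geolocation clustering)."""
    cells = Counter((round(lat / cell), round(lon / cell)) for lat, lon in points)
    return [c for c, n in cells.items() if n >= min_pts]

traffic = ["GET /ads?lat=32.1093&lon=34.8555 HTTP/1.1"] * 4 + \
          ["GET /w?lat=31.7683&lon=35.2137 HTTP/1.1"]
pois = grid_pois(mine_coords(traffic))
```

Repeated fixes at one spot form a dense cell (a candidate POI), while a one-off fix elsewhere is discarded — the same density intuition behind clustering sparse leakage indicators into POIs.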
Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection
Neural networks have become an increasingly popular solution for network
intrusion detection systems (NIDS). Their capability of learning complex
patterns and behaviors makes them a suitable solution for differentiating
between normal traffic and network attacks. However, a drawback of neural
networks is the amount of resources needed to train them. Many network gateway
and router devices, which could potentially host an NIDS, simply do not have
the memory or processing power to train and sometimes even execute such models.
More importantly, the existing neural network solutions are trained in a
supervised manner, meaning that an expert must label the network traffic and
update the model manually from time to time.
In this paper, we present Kitsune: a plug-and-play NIDS which can learn to
detect attacks on the local network, without supervision, and in an efficient
online manner. Kitsune's core algorithm (KitNET) uses an ensemble of neural
networks called autoencoders to collectively differentiate between normal and
abnormal traffic patterns. KitNET is supported by a feature extraction
framework which efficiently tracks the patterns of every network channel. Our
evaluations show that Kitsune can detect various attacks with a performance
comparable to offline anomaly detectors, even on a Raspberry Pi. This
demonstrates that Kitsune can be a practical and economic NIDS.
Comment: Appears in Network and Distributed Systems Security Symposium (NDSS) 201
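KitNET's two-tier structure (an ensemble of small reconstructors over feature subspaces, plus an output tier combining their errors) can be caricatured in a few lines. Real KitNET trains small autoencoders online; here a per-subspace mean reconstruction stands in for them, and the feature values, subspace split, and RMS combiner are all assumptions.

```python
class TinyEnsembleDetector:
    """Structure-only sketch of a KitNET-like detector: tier 1 scores each
    feature subspace by reconstruction error (mean reconstruction stands
    in for a small autoencoder), tier 2 combines the per-subspace errors."""
    def __init__(self, subspaces):
        self.subspaces = subspaces  # disjoint groups of feature indices
        self.means = []

    def fit(self, X):
        self.means = [[sum(x[i] for x in X) / len(X) for i in idx]
                      for idx in self.subspaces]
        return self

    def score(self, x):
        errs = [(sum((x[i] - m) ** 2 for i, m in zip(idx, mean)) / len(idx)) ** 0.5
                for idx, mean in zip(self.subspaces, self.means)]
        return (sum(e * e for e in errs) / len(errs)) ** 0.5  # output tier

normal = [[1.0, 2.0, 3.0, 4.0]] * 20  # per-channel traffic features
det = TinyEnsembleDetector([(0, 1), (2, 3)]).fit(normal)
```

Splitting the feature space into small subspaces is what keeps each ensemble member cheap enough for resource-constrained hardware, while the output tier aggregates their verdicts into one anomaly score.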
Can't Boil This Frog: Robustness of Online-Trained Autoencoder-Based Anomaly Detectors to Adversarial Poisoning Attacks
In recent years, a variety of effective neural network-based methods for
anomaly and cyber attack detection in industrial control systems (ICSs) have
been demonstrated in the literature. Given their successful implementation and
widespread use, there is a need to study adversarial attacks on such detection
methods to better protect the systems that depend upon them. The extensive
research performed on adversarial attacks on image and malware classification
has little relevance to the physical system state prediction domain, which most
of the ICS attack detection systems belong to. Moreover, such detection systems
are typically retrained using new data collected from the monitored system,
thus the threat of adversarial data poisoning is significant; however, this
threat has not yet been addressed by the research community. In this paper, we
present the first study focused on poisoning attacks on online-trained
autoencoder-based attack detectors. We propose two algorithms for generating
poison samples, an interpolation-based algorithm and a back-gradient
optimization-based algorithm, which we evaluate on both synthetic and
real-world ICS data. We demonstrate that the proposed algorithms can generate
poison samples that cause the target attack to go undetected by the autoencoder
detector; however, the ability to poison the detector is limited to a small set
of attack types and magnitudes. When the poison-generating algorithms are
applied to the popular SWaT dataset, we show that the autoencoder detector
trained on the physical system state data is resilient to poisoning in the face
of all ten of the relevant attacks in the dataset. This finding suggests that
neural network-based attack detectors used in the cyber-physical domain are
more robust to poisoning than in other problem domains, such as malware
detection and image processing.
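The interpolation idea (no single retraining batch looks anomalous, yet the detector's notion of "normal" drifts toward the attack state) reduces to a simple schedule. This sketch conveys only the intuition, not the paper's actual algorithm; the state vectors and step count are made up.

```python
def interpolation_poison(clean, target, steps):
    """Emit `steps` samples moving linearly from a clean system state
    toward the attack state, intended to be fed into successive online
    retraining windows of the detector."""
    return [[(1 - s / steps) * c + (s / steps) * t
             for c, t in zip(clean, target)]
            for s in range(1, steps + 1)]

schedule = interpolation_poison(clean=[0.0, 0.0], target=[8.0, 4.0], steps=4)
```

Each successive sample is only a small step from the previous one, so an online-retrained detector that accepts one batch is likely to accept the next — until the final batch equals the attack state itself.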
Query-Efficient Black-Box Attack Against Sequence-Based Malware Classifiers
In this paper, we present a generic, query-efficient black-box attack against
API call-based machine learning malware classifiers. We generate adversarial
examples by modifying the malware's API call sequences and non-sequential
features (printable strings), and these adversarial examples will be
misclassified by the target malware classifier without affecting the malware's
functionality. In contrast to previous studies, our attack minimizes the number
of malware classifier queries required. In addition, in our attack, the
attacker need only know the class predicted by the malware classifier;
knowledge of the classifier's confidence score is optional. We evaluate
the attack effectiveness when attacks are performed against a variety of
malware classifier architectures, including recurrent neural network (RNN)
variants, deep neural networks, support vector machines, and gradient boosted
decision trees. Our attack success rate is around 98% when the classifier's
confidence score is known and 64% when just the classifier's predicted class is
known. We implement four state-of-the-art query-efficient attacks and show that
our attack requires fewer queries and less knowledge about the attacked model's
architecture than other existing query-efficient attacks, making it practical
for attacking cloud-based malware classifiers at a minimal cost.
Comment: Accepted as a conference paper at ACSAC 202
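The decision-based (label-only) setting can be illustrated with a greedy padding loop. Both the toy classifier and the insertion strategy are placeholders: the actual attack modifies API call sequences far more carefully to remain query-efficient and preserve functionality, and real targets are RNNs, SVMs, or gradient boosted trees rather than a call-ratio rule.

```python
SUSPICIOUS = {"WriteProcessMemory", "CreateRemoteThread"}

def toy_classifier(seq):
    """Placeholder target: flags a sequence whose majority of calls are
    suspicious."""
    return "malware" if sum(c in SUSPICIOUS for c in seq) / len(seq) > 0.5 else "benign"

def evade(sequence, classify, benign_calls, max_queries=50):
    """Label-only evasion: prepend semantically inert API calls one at a
    time, querying only for the predicted class, until the label flips or
    the query budget runs out."""
    seq, queries = list(sequence), 0
    for call in benign_calls:
        seq = [call] + seq
        queries += 1
        if queries > max_queries:
            break
        if classify(seq) == "benign":
            return seq, queries
    return None, queries

adv, used = evade(["WriteProcessMemory", "CreateRemoteThread"],
                  toy_classifier, ["RegQueryValueExW", "GetTickCount", "Sleep"])
```

Counting every classifier call against a budget is the essential discipline of a query-efficient attack: the attacker's cost against a cloud-hosted classifier is measured in queries, not gradient computations.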